Skip to content

Improve LDAP desktop discovery#30277

Merged
zmb3 merged 1 commit intomasterfrom
zmb3/ldap-discovery
Aug 10, 2023
Merged

Improve LDAP desktop discovery#30277
zmb3 merged 1 commit intomasterfrom
zmb3/ldap-discovery

Conversation

@zmb3
Copy link
Copy Markdown
Collaborator

@zmb3 zmb3 commented Aug 10, 2023

In environments with many desktops where some DNS queries time out, we can run in to situations where the discovery loop doesn't complete before the desktop's 10m TTL expires, causing desktops to disappear and reappear sporadically.

  • Increase the TTL for LDAP-discovered desktops to 30m. This won't harm UX, since desktops that stop being discovered are already purged prior to their expiration.
  • Perform both DNS queries (via the default resolver and by hitting the LDAP server directly) in parallel. This reduces the maximum time a single lookup can take from 20s to 5s.

@zmb3 zmb3 requested review from ibeckermayer and removed request for codingllama and kimlisa August 10, 2023 16:02
Copy link
Copy Markdown
Contributor

@Tener Tener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in general.

One thing to note is that we are changing the effective timeouts here. Previously, the system DNS had 10 seconds of timeout, the LDPA one had... unlimited? In any case tctx was only used for the former, the latter used parent ctx. Now we trim those 10 seconds to 5, and make the LDAP query run in 5 seconds flat. (Contrast with 20 seconds we have seen in the logs today.).

I think this is a good change anyway, just highlighting this to you in case this is relevant.

Comment thread lib/srv/desktop/discovery.go Outdated
Comment thread lib/srv/desktop/discovery.go Outdated
Comment thread lib/srv/desktop/discovery.go Outdated
In environments with many desktops where some DNS queries time out,
we can run in to situations where the discovery loop doesn't complete
before the desktop's 10m TTL expires, causing desktops to disappear
and reappear sporadically.

- Increase the TTL for LDAP-discovered desktops to 30m. This won't
  harm UX, since desktops that stop being discovered are already
  purged prior to their expiration.
- Perform both DNS queries (via the default resolver and by hitting
  the LDAP server directly) in parallel. This reduces the maximum
  time a single lookup can take from 20s to 5s.
@zmb3 zmb3 force-pushed the zmb3/ldap-discovery branch from eb5ba12 to c26e384 Compare August 10, 2023 21:47
@zmb3 zmb3 enabled auto-merge August 10, 2023 21:48
@zmb3 zmb3 added this pull request to the merge queue Aug 10, 2023
Merged via the queue into master with commit e645977 Aug 10, 2023
@zmb3 zmb3 deleted the zmb3/ldap-discovery branch August 10, 2023 22:22
@public-teleport-github-review-bot
Copy link
Copy Markdown

@zmb3 See the table below for backport results.

Branch Result
branch/v11 Failed
branch/v12 Failed
branch/v13 Failed

zmb3 added a commit that referenced this pull request Aug 24, 2023
In #30277 we made the discovery process make two DNS queries
in parallel. There was an error in this logic - since we don't
write to the channel on error, if both queries fail we end up
waiting a full 5 seconds even if the failures occur much quicker
than that.
github-merge-queue Bot pushed a commit that referenced this pull request Aug 25, 2023
In #30277 we made the discovery process make two DNS queries
in parallel. There was an error in this logic - since we don't
write to the channel on error, if both queries fail we end up
waiting a full 5 seconds even if the failures occur much quicker
than that.
github-actions Bot pushed a commit that referenced this pull request Aug 25, 2023
In #30277 we made the discovery process make two DNS queries
in parallel. There was an error in this logic - since we don't
write to the channel on error, if both queries fail we end up
waiting a full 5 seconds even if the failures occur much quicker
than that.
github-actions Bot pushed a commit that referenced this pull request Aug 25, 2023
In #30277 we made the discovery process make two DNS queries
in parallel. There was an error in this logic - since we don't
write to the channel on error, if both queries fail we end up
waiting a full 5 seconds even if the failures occur much quicker
than that.
github-actions Bot pushed a commit that referenced this pull request Aug 25, 2023
In #30277 we made the discovery process make two DNS queries
in parallel. There was an error in this logic - since we don't
write to the channel on error, if both queries fail we end up
waiting a full 5 seconds even if the failures occur much quicker
than that.
github-merge-queue Bot pushed a commit that referenced this pull request Aug 25, 2023
In #30277 we made the discovery process make two DNS queries
in parallel. There was an error in this logic - since we don't
write to the channel on error, if both queries fail we end up
waiting a full 5 seconds even if the failures occur much quicker
than that.
github-merge-queue Bot pushed a commit that referenced this pull request Aug 25, 2023
In #30277 we made the discovery process make two DNS queries
in parallel. There was an error in this logic - since we don't
write to the channel on error, if both queries fail we end up
waiting a full 5 seconds even if the failures occur much quicker
than that.
github-merge-queue Bot pushed a commit that referenced this pull request Aug 25, 2023
In #30277 we made the discovery process make two DNS queries
in parallel. There was an error in this logic - since we don't
write to the channel on error, if both queries fail we end up
waiting a full 5 seconds even if the failures occur much quicker
than that.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants